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A DYNAMIC DOMAIN DECOMPOSITION 
FOR A CLASS OF SECOND ORDER SEMI-LINEAR EQUATIONS 

SIMONE CACACE AND MAURIZIO EALCONE 


Abstract. We propose a parallel algorithm for the numerical solution of a class of second order 
semi-linear equations coming from stochastic optimal control problems, by means of a dynamic do¬ 
main decomposition technique. The new method is an extension of the patchy domain decomposition 
method presented in [6] for first order Hamilton-Jacobi-Bellman equations related to deterministic 
optimal control problems. The semi-Lagrangian scheme underlying the original method is modified 
in order to deal with (possibly degenerate) diffusion, by approximating the stochastic optimal control 
problem associated to the equation via discrete time Markov chains. We show that under suitable 
conditions on the discretization parameters and for sufficiently small values of the diffusion coeffi¬ 
cient, the parallel computation on the proposed dynamic decomposition is faster than that on a static 
decomposition. To this end, we combine the parallelization with some well known techniques in the 
context of fast-marching-like methods for first order Hamilton-Jacobi equations. Several numerical 
tests in dimension two are presented, in order to show the features of the proposed method. 
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1. Introduction. Domain decomposition techniques have become popular in 
the solution of partial differential equations arising in several applied contexts, in¬ 
cluding fluid dynamics and electromagnetism. 

Given a decomposition of a domain in subsets of manageable size and prescribed 
suitable transmission conditions at the interfaces or overlapping regions between the 
sub-domains, massive parallel computations can be performed exploiting the com¬ 
puting power of modern clusters of CPUs and/or GPUs. This is the most common 
strategy to attack the so called curse of dimensionality, a quite general term to denote 
the difficulties related to the numerical approximation of problems in high dimension, 
both in terms of memory requirements and computational efforts. New efficient and 
accurate parallel algorithms are increasingly required, to open the way to the solution 
of real-life problems with a very huge number of degrees of fredom. 

Despite such techniques have been mainly designed, analyzed and implemented in 
the framework of variational methods for elliptic and parabolic equations, in the last 
decades an increasing interest has also emerged in fields more related to hyperbolic 
equations, e.g. optimal control problems, differential games, front propagation and 
image processing. In this setting, a leading role is played by the theory of viscosity 
solutions for Hamilton-Jacobi equations, representing a solid theoretical background 
for a growing number of numerical methods for the solution of problems in robotics, 
aeronautics, electrical and aerospace engineering. 

In the authors proposed the patchy domain decomposition method, a par¬ 
allel algorithm for the solution of deterministic optimal control problems, based on 
a semi-Lagrangian discretization of the corresponding first order Hamilton-Jacobi- 
Bellman equations. The main idea there can be summarized as follows. First the 
solution of the equation is computed on a grid which is coarse with respect to the 
actual (fine) grid. Then, an approximation of the optimal vector field associated to 
the underlying control problem is synthesized on the fine grid. Finally, a decompo¬ 
sition of the fine grid is dynamically build driven by the approximate optimal vector 
field, and the equation is solved in parallel on the corresponding sub-domains. This 
pre-computation has clearly a cost and produces a rather complex subdivision of the 
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computational box, but gives the fundamental property that each sub-domain is, up 
to an error, independent on the others. This feature allows for an efficient paral¬ 
lelization, since transmission conditions at the interfaces of the sub-domains can be 
completely avoided. The computational resources can be better employed, so that 
no processor remains idle or computes useless information, thus saving CPU time to 
reach convergence. Despite the pre-computation step and the non trivial implementa¬ 
tion, the overall performance of the patchy decomposition overcomes that of a static 
domain decomposition. 

Another technique widely used to reduce the computational efforts when comput¬ 
ing the solution to first order Hamilton-Jacobi equations, is to exploit the so called 
causality property. This term refers to a peculiarity of hyperbolic equations, namely 
the fact that, starting from the boundary data, information propagates in the domain 
along characteristics at finite speed. Fast-marching methods [UlllilllS] have been 
developed trying to reproduce this feature at the discrete level. The main idea is 
to process the grid nodes in a suitable order that decouples, at least partially, the 
nonlinear system corresponding to the discretization of the equation. Moreover, the 
computation is localized, at each iteration, on the nodes that bring the relevant in¬ 
formation. Due to the causality property, these nodes are very few, compared to the 
size of the whole grid. In this way, the solution can be computed in a cascade fash¬ 
ion, accelerating the convergence of the underlying scheme significantly. It turns out 
that each node converges in a predetermined and finite number of iterations, only one 
iteration in the most favorable cases. For this reason fast-marching methods are also 
termed single-pass methods. 

Keeping these ideas in mind, in this paper we revisit the patchy domain decompo¬ 
sition in the context of second order semi-linear equations. We try to extend the main 
features of the method to a class of problems where diffusion appears, namely sta¬ 
tionary nonlinear Hamilton-Jacobi-Bellman equations coming from stochastic optimal 
control problems. A prototype problem is the following: 



where e > 0 is the diffusion coefficient, A is a compact set of admissible controls, / 
is the controlled dynamics driving the system (aka the controlled advection), I is a 
source term and g is a boundary datum. 

The semi-Lagrangian local solver of the original patchy method has to be modi- 
hed in order to deal with diffusion. Indeed, the presence of diffusion invalidates the 
connection with deterministic optimal control, in the sense that characteristic curves 
associated to the hyperbolic equation are no longer well defined in this more gen¬ 
eral setting. Nevertheless, a weak notion of characteristics can be still provided via 
stochastic differential equations, interpreting the diffusion as a Wiener process. Then, 
a semi-Lagrangian discretization of the corresponding semi-linear equation can be still 
performed, approximating the stochastic process via discrete time Markov chains. The 
resulting scheme is able to compute the solution of equations with very degenerate 
diffusions. This is a very delicate problem in the community working on advection- 
diffusion equations, especially in cases of interest where the diffusion is much smaller 
than the advection (e.g. Navier-Stokes equations in fluid dynamics). Indeed, it is well 
known that a degeneracy in the diffusion translates into a lack of coercivity of the 
variational form associated to the equation. In particular, discretizations based on 
finite elements need to be stabilized with ad-hoc procedures, in order to capture the 
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boundary or internal layers that typically arise in such equations. 

In this more general setting the application of the patchy domain decomposition 
is not trivial. The main obstacle is that the transmission conditions at the interfaces 
cannot be avoided, due again to the presence of diffusion. A natural question arises: 
is the patchy domain decomposition still favorable than a static domain decomposi¬ 
tion? We show that the answer to this question depends on the ratio between the 
controlled advection and the diffusion coefficient, a characteristic quantity similar to 
the Reynolds number for Navier-Stokes equations. More precisely, we show that if the 
controlled advection dominates diffusion and the discretization parameters are chosen 
in a suitable way, then a parallel computation on the dynamic decomposition can still 
be faster than that on a static decomposition. To reach this achievement, we need 
to exploit all the information collected in the pre-computation step of the dynamic 
decomposition. In particular, we try to employ, despite the diffusion, the causality 
property of hyperbolic equations discussed above. Further technical modifications 
should be applied to the local solver, in order to remove from the scheme the depen¬ 
dency of each node on itself, an issue that dramatically breaks down the causality. 
This results in a remarkable speedup for the computation, even more substantial if 
combined with a parallel algorithm. 

The paper is organized as follows: in Section 2 we briefly review, for the read¬ 
ers convenience, the patchy domain decomposition method for first order Hamilton- 
Jacobi-Bellman equations proposed in [5]. Section 3 is devoted to the extension of 
the semi-Lagrangian local solver to second order semi-linear equations, in a form suit¬ 
able for our purposes. In Section 4, we show how to adapt the patchy decomposition 
method to the more general setting of equations with diffusion, in particular we es¬ 
tablish conditions on the parameters that guarantee the effectiveness of the dynamic 
domain decomposition for the parallel computation. Finally, in Section 5, we present 
several numerical tests in dimension two, in order to show the performance of the 
proposed method compared to its static version. 

2. Patchy decomposition for first order HJB equations. In this section we 
review the patchy domain decomposition method, proposed in to solve boundary 
value problems for first order Hamilton-Jacobi equations of the form 


( 2 . 1 ) 


max {—f(x, a) ■ Vu(a;) — Ra;, a)} = 0 

aeA 

u{x) = g{x) 


X G 

X G dQ 


where C M" is an open set and A C K™ is a compact set. Moreover, f : fix A ^ K" 
is a vector field and u : fl ^ M., I : fl x A ^ M., g : dfl -A M are scalar functions. 

It is well known that equation (2.11 can be interpreted, via the celebrated dy¬ 
namic programming principle, as the Hamilton-Jacobi-Bellman equation associated 
to a suitable deterministic optimal control problem. Indeed, let us consider the fol¬ 
lowing controlled dynamical system 


f y{i) = f{yit),ait )), t>0 
1 2/(0) =x 


where y{-) is the state of the system and a(-) is a generic function belonging to the 
following set of admissible controls: 


A = {a : [0, -l-oo) —>■ A, measurable} . 
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We denote by y{t;x,a{-)) the solution to (2.2) starting from x G using the control 
a(-), and we define the first time arrival to the boundary dfl as 

t{x, a(-)) = inf {t > 0 : y{t] x, «(•)) G 5^^} • 

Finally, we consider the following functional: 


J{x,a{-)) = 




l{y{s-,x,a{-)),a{s)) ds + g{y{T{x,a{-)))) . 


In this setting, the minimum time problem with running cost I and exit cost g consists 
in finding, for each a; S O, an optimal control Q!*(-) G A that minimizes J among all 
the admissible controls. Under suitable assumptions on the data, it can be proved 
that the value function of the problem, i.e. 

u(x) = inf J(x,a(-)) x G fl 
a(-)GA ^ ’ 


is the unique viscosity solution to (2.1). We refer the reader to [T] for the details. 

The main advantage of this approach is that, once the value function u is com¬ 
puted, we can quite easily synthesize an optimal control for the minimum time prob¬ 
lem, by taking 


(2.3) 


a* (x) = arg min {f{x, a) ■ Vu(x) 

aeA 


l(x,a)} 


X G fl ■ 


Note that a* is in feedback form, namely it depends only on the the state an not on 
the time. It follows that, for each x G kl, we can compute the optimal trajectory y*{-) 
starting at x (i.e. a characteristic curve of the hyperbolic equation (2.1)) by simply 
plugging a* in the dynamical system (2.2): 


y*it) = f{y*it), a*{y*{t))) , 

2 /*( 0 ) = X 


t > 0 


This is not the case for other types of techniques, based on the Pontryagin maximum 
principle, which provide only necessary conditions for optimality and sub-optimal 
open-loop controls (i.e. controls that depend on time). 

Unfortunately, from the numerical point of view, the Hamilton-Jacobi approach 
implies severe computational efforts, since it requires the computation of the value 
function u on the whole state space Q. Considering that industrial applications de¬ 
mand the solution of optimal control problems at least in dimension six (as for the 
most simple second order dynamics in with controlled acceleration), this approach 
still suffers the curse of dimensionality mentioned in the Introduction. For this rea¬ 
son, the development of efficient parallel algorithms for Hamilton-Jacobi equations 
is nowadays an active field of research. The patchy domain decomposition method 
places itself exactly in this context. 


The semi-Lagrangian discretization of equation (2.1) is postponed to the next 


section, in a generalized form that recovers first order equations as a particular cases. 
The interested reader can refer to [S] for the original version of the scheme. Here, we 
prefer to keep the discussion free from technical details, and mainly focus on the ideas 
behind the construction of the parallel algorithm. 

We consider two different discretizations G and Gc of the state space U. The grid 


G denotes the actual grid on which we want to solve the problem (2.1), also termed 


the fine grid, whereas Gc is very coarse compared to (and possibly contained in) G. 
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The first step of the method consists in computing a coarse solution Uc on Gc, also 
via parallel computations using a standard (static) domain decomposition technique. 
Due to the low resolution of the grid, this is a very cheap operation, but gives a first 
rough approximation on G, say ii, of the actual solution u. It is reconstructed by 
interpolation of Uc on G. 

Now we employ u to compute a feedback optimal control a*, via an appropriate 
discrete version of the synthesis procedure (2.3), presented in Section]^ We stress 
that a* is just a coarse approximation of the actual optimal control a*, but it is defined 
on the fine grid G. This is enough to start the construction of the patchy domain 
decomposition. 

We divide the boundary dO, in a fixed number Np of disjoint sub-sets, denoted 
by Tp, with p = l,...,7Vp. Then, for each p, we compute the sub-domain flp C fl 
as the numerical domain of dependence of Tp through the optimal dynamics /*(•) = 
/(’,«*(■))• Let us clarify this point in a continuous setting and postpone the actual 
implementation to Section We denote by xt^ the characteristic function of the sub¬ 
boundary Tp, and consider the following Cauchy-Dirichlet problem for the advection 
equation: 


(2.4) 


dt(j)p{x, t) — /(a:, d*{x)) ■ V(j)p(x, t) = 0 (a;, t) € fl x (0, -l-oo) 


(()p(x,0) = 0 

= Xr^{x) 


X G fl 

(x, t) G dil X (0, -l-oo) 


It is well known that the boundary datum acts as a source of information, that flows 
(backward) in along characteristics, according to the drift f*. The limit in time 


(/)“(a:)= lim (j)pix,t) 

^ t—>- + oo 


is still a characteristic function, since the hyperbolic equation preserves the properties 
of xr (e.g. the maximum and the singularities). Then, we define the p-th patch of 
our dynamic decomposition as 

flp = {a: e fl : 4>'^{x) = 1} . 

We remark that each patch is a bundle of characteristics enjoying, by construction, 
the fundamental property of being invariant with respect to the optimal dynamics, 
i.e. f*{flp) C flp. This is not completely true, since f* is built using only the coarse 
control a*. Moreover, at the discrete level, the projection of the patches on the grid 
G introduces an additional error, in particular if the dynamics /* defines very bended 
characteristic curves. Figurej^shows the patchy decomposition for two test dynamics, 
in dimension two and three respectively. Note that, by construction, the patches do 
not overlap, sharing only sharp interfaces. 

The next step is the parallel computation on the patches. This can be done 
avoiding completely the transmission conditions, exploiting the invariance property 
just discussed above. Since this feature is weakened at the discrete level, it can be 
enforced in the computation by imposing state constraint conditions at the interfaces. 

Finally, all the solutions are merged together, producing a solution on the whole 
domain 17. As expected, this patchy solution is slightly different from that computed 
using a standard domain decomposition method. Nevertheless, the error is localized 
at the interfaces between the patches and does not propagate in the interior of the 
sub-domains, provided that the grid Gc is not too much coarse compared to the fine 
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grid G. This is shown in by numerical evidence, but a precise error estimate is still 
missing. 

Despite the small errors, the absence of transmission conditions can give to the 
parallelization a remarkable speedup. In this respect a key role is played by the 
relative sizes of the patches. Indeed, we remark that the construction of the patchy 
decomposition is completely driven by the dynamics of the optimal control problem. 
Hence, even a subdivision of the boundary 9D in sub-sets of the same size can produce 
a highly unbalanced domain decomposition. In these cases the performance of the 
parallelization is very poor, since the processors associated to the smaller patches 
complete their job (and remain idle) much earlier than those corresponding to the 
larger patches. This drawback was pointed out in [5] and can be overcome via a 
multi-level technique, which is currently under development. The idea is to alternate 
the construction of the patchy decomposition with the computation of the patchy 
solution. More precisely, one can start and continue the construction of the patches 
as long as they have about the same size, obtaining a first level of balanced sub- 
domains. The solution is then computed on the first level sub-decomposition. The 
new boundaries (possibly divided again in sub-sets of the same size) are employed 
to start and build the second level sub-decomposition. Moreover, the values of the 
first level solution at the corresponding points (correct values due to the causality 
property of hyperbolic equations) are assigned as boundary data for the computation 
of the second level solution. This procedure is then iterated and terminates when the 
sub-decompositions cover the whole domain D. 

3. A semi-Lagrangian scheme for second order HJB equations. In this 
section we present a semi-Lagrangian discretization for a special class of stationary 
second order semi-linear equations. The resulting scheme will be employed as local 
solver for the extension of the patchy method to this more general setting, as discussed 
in the next section. 

The prototype problem we have in mind is the following boundary value problem 
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for second order semi-linear equations, written in a control theory perspective: 

{ max{—'D(x,a)uix) — l(x,a)} = 0 x G ft 
u{x) = g{x) X G dVt 

where 17 C K" is an open set, u : 17 —)■ R, A C M"* is a compact set and the second 
order differential operator T) is given by 


Dix, a) 


aikix,a)ajk{x,a) 

ij —1 \k—l 


dxidxj 


+ ^Mx,a) 

Z=1 


A 

dxi 


with cr : 17 X A —)■ £(IR“; K"), f : n x A ^ K" and Z : 17 x A — )■ R, 5 : 917 -A R. 

The crucial point here is that the connection with deterministic optimal control, 
depicted in the previous section, is lost due to the presence of the diffusion term a. 
Indeed, in this case, characteristic curves are no longer well defined by the system 
of controlled ordinary differential equations (2.21. Nevertheless, a weak notion of 
characteristics is still available, interpreting these curves as generalized trajectories^ 
namely solutions to the following system of controlled stochastic differential equations 
with dynamics /: 


f dX{t) = f{X{t),a{t))dt + a{X{t),a{t))dW{t), t>0 
\ X(0) = a; 


Here X(t) is a progressively measurable process, representing the state of a system 
evolving in 17 starting from x, the process a(t) is the control applied to the system 
at the time t with values in the control set A and W{t) is a d-dimensional Wiener 
process. We define the set of admissible controls 


A = {a : [0, -boo) —>■ A, progressively measurable} 


and we denote by X{t;x,a{-)) the solution to (3.21 starting from x using the 
«(•). Finally, we define the first time arrival to the boundary 917 as 


control 


r(x,a(-)) = inf{7 > 0 : X{t;x,a{-)) G 917} 


and we consider the following cost functional (E stands for the probabilistic expecta¬ 
tion): 


J(x, «(•)) = E ■ 




l{X{s;x,a{-)),a{s)) ds +g{X{T{x,a{-)))) \ . 


In this setting, the unique viscosity solution u to the problem (3.1) can be interpreted 
as the value function of the following stochastic minimum time problem with running 
cost I and exit cost g: 


u(x) = inf j(x,a(j) x G 17 . 
ai-)GA ^ ’ 

Typical assumptions on /, g, I and a for the well posedness of the problem is bounded¬ 
ness and Lipschitz continuity in space uniformly in the control. We refer the interested 
reader to [7] and the references therein for further explanations and details. 






Following [7], the semi-Lagrangian discretization of equation (3.1) can be per¬ 
formed introducing a time step h, interpreting the first order term in the operator D 
as a directional derivative of u along the dynamics / and approximating the Wiener 
process in the stochastic differential equation (3.2) by means of discrete time Markov 
chains. We introduce in the state space a structured grid G with uniform step Ax in 
each coordinate direction and nodes Xi for i = This leads to the following 

scheme, which is a nonlinear system in the form of a fixed point operator: 


(3.3) 


^Tk = ^1 + hfixi, a) + sVhakixi, a) 


(xi) = min ^ ^ u{x'^;D + hl{xi,a) 


j.eA 2d 


u{xi) = g{xi) 


k=l s=±l 


s = ±1, k = 1 ,..., d 
X2 € G n n 
Xi G Gn dfl 


where ak denotes the fc-th row of a. The evaluation of the min operator is done by 
direct comparison discretizing the control set A with Na points. Moreover, as usual 
in semi-Lagrangian approximations, the 2d points do not lie in general on the 
grid and the values of u at the these points need to be reconstructed by interpolation. 
A natural choice here is the bilinear interpolation, as shown in Figure 



Fig. 2. An example in dimension 2 with cr : fl x A ^ £(]R^;]R^). The arrow represents the 
vector field f(xi,a) for a generic control a a A. For s = ±1 and k = 1,2 the values at (white) 
points are reconstructed via bilinear interpolation of (black) grid nodes. 


We choose a time step h such that, for each node xt G G and every control a G A, 
the points a;“’^ fall in the first neighboring cells of Xi. The reason is twofold. First, 


it is easy to prove that the semi-Lagrangian scheme (3.3) is consistent to equation 
only with order Oih) and we do not want to get a too poor accuracy in the 
approximation of the trajectories. Second and more important, we want to keep the 
stencil of the scheme strictly localized, in order to reduce the computational efforts. 
Now, let us denote by x^^, the four points involved in the 

respectively. 


a.s a.s a.s a,s 
^Z0,fe’ 

reconstruction of the value u{x1'^), with weights A“’® 


\a,s \o,tS \Q-,s 
^12,ki ‘^i3,k 

Without loss of generality, we can assume that = Xi for each k = l,...,d and 
s = ±1. It follows that 


/ a,s\ \ ^ 

^i^^k) = 


p=0 p—1 
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Substituting this expression in the scheme (3.3) we get 
(3.4) 

uix,) = mm I ^ ^ ^ ^ E E E 


k—1 s=±l 


+ hl{xi, a) 


fc=ls=±lp=l 


which shows explicitly the dependency of u{xi) on itself. 

As explained in the Introduction, in the hyperbolic case (cr = 0) relevant infor¬ 
mation starts from the boundary of the domain, and propagates along characteristics 
backward in time at finite speed. At the discrete level, we can try to mimic this 
causality property, by means of an appropriate ordering of the grid nodes that tracks 
the front of information. In this way, we can get a cascade effect that decouples, 
at least partially, the nonlinear system, drastically reducing the number of iterations 
to reach convergence. Despite this property does not hold in the continuous case in 
presence of diffusion (cr ^ 0), we can still have a numerical causality if some suitable 
condition on the parameters is satisfied. We will come back on this important point 
in the next section. 


Here the key argument is that we have to remove in (3.4) the self-dependency 


on u{xi), which makes the scheme strongly iterative by construction. This is a really 
crucial step, if we hope to benefit of the speedup induced by the causality property. 
To this end, we denote by 


1 






2d^ ^ ’ 

k—1 s—ztl 


so that 


(3.5) 




2d 


k—1 s—dzl p—1 


M(a:i) = min{Ao(a)u(xi)-I-Ai(a)} . 

aeA 


We remark that Ao(a) and Ai(a) do not depend on the value u{xi). Moreover, we 
note that Ao(a) < 1 for all a S A, since the interpolation weights can be 

simultaneously equal to 1 only if the time step h = 0. Then, it is well defined the 
value 


(3.6) 


v(xi) = min 
a^A 


Ai(a) ] 

1 - Ao(a) J 


and we claim that v{xi) = u{xi). This kind of explicitation is trivial in the linear case 
without diffusion (i.e. for / not depending on the control a and cr = 0), but it is less 
evident (and to our knowledge also quite surprising) in the general nonlinear case. So 
let a* £ A be the control achieving the minimum in (3.5), i.e. 


u{xi) = Ao{a*)u{xi) + Ai{a*). 


This implies 


u{xi) 


Ai{a*) . f Ai(a) 

-;-;-- > mm < -;- 

1 — Ao(a*) aeA ( 1 — Ao(a) 


v(xi). 


Conversely, let a* £ A be the control achieving the minimum in (3.6), i.e. 


v(xi) 


Ai(o*) 

1 - Ao(a*) ■ 
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This implies 


u{xi) < Ao{a*)u{xi) + Ai{a*). 


Since Ao(a*) < 1, we get 


/ N Ai(a*) , , 

u[x,) < - - = v{xi). 

1 - Ao(a*) 

Finally, we obtain the following modified scheme 
x'^'^ = Xi + hf{xi,a) + sy/hahixi^a) 


(3.7) 


s = ±1, fc = 1 ,d 


2d 




u(xi) = min < 
a^A 


/c=l s=±l p=l 




k=l s=±l 


. uixi) = g{xi) 


/ Xi € dr n n 


Xi s G n dVl 


where the value at each node Xi now depends only on values at nodes different from 
Xi- In Section]^ we compare this scheme with the original scheme (3.3), showing the 
effectiveness of the modification in terms of iterations to reach convergence. 

We conclude this section by remarking that the proposed semi-Lagrangian scheme 
can handle by construction very degenerate diffusions. Indeed, at points where the 
diffusion coefficient cr = 0, we naturally recover the semi-Lagrangian scheme presented 
in [5] for first order Hamilton-Jacobi-Bellman equations (see also the numerical ex¬ 
periments in Section]^. This is not the case for other types of discretization, e.g. 
finite elements, where the degeneracy in the diffusion produces instabilities that need 
to be treated with very specific techniques. This is due to the fact that the variational 
forms associated to the equations under consideration suffer a lack of coercivity. We 
refer the interested reader to m for details and insights on this topic. 


4. Patchy decomposition for second order semi-linear equations. In this 
section we aim to extend the patchy decomposition method to the class of second order 
semi-linear equations presented in the previous section. 

The main idea is really simple: we first compute the patchy domain decomposition 
in the hyperbolic case with cr = 0. Then we solve in parallel the full equation with 
cr ^ 0 using the patchy decomposition. 

At a first look this attempt could seem meaningless. Indeed, due to the diffusion, 
information spreads instantaneously from the boundary to the whole domain and then 
it cannot be confined in independent sub-domains, as for the first order case. This 
implies that, in order to compute the correct solution in this more general setting, we 
cannot replace the transmissions between the patches with state constraint boundary 
conditions, as discussed in Section So we loose the main advantage of the patchy 
decomposition compared to an arbitrary and static decomposition. Moreover, we 
recall that the preliminary step in the computation of the dynamic decomposition 
results in an additional cost in terms of CPU time. The reader can easily convince 
himself that, in presence of transmission conditions, the performance of a parallel 
computation does not significantly depend, in general, on the particular shape of the 
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sub-domains, but only on their number. As pointed out in [6], if we assume that the 
sub-domains have about the same size, then the overall time consumption to compute 
the solution almost exclusively depends on the number of processors involved in the 
computation and the transmission delays. 

Nevertheless, we show through the next sections that the patchy method with 
transmission conditions can still be competitive if we combine two different features, 
described in the following sub-sections. 


4.1. The up-wind diffusion ball condition. Here we present the first ingre¬ 
dient for the extension of the patchy method to second order semi-linear equations. 
In particular we show that, under suitable relations between the parameters, the 
discretization of equation ( |3.1| ) behaves in a sense more like hyperbolic than elliptic. 

To simplify the presentation we consider a special case in dimension two. Let 
e > 0 and take a : —>■ £(K^; K^) of the form 

(4.1) 


Equation (3.11 writes 


—eAu(x) + max {—f(x, a) ■ Vm(x) — l(x, a)} = 0 x G ft 

aeA 

u(x) = g(x) X € dfi 


and the semi-Lagrangian scheme (|3.7[) simplifies in 


^Tk = + hf{xi, a) + sy/2eh Ck 


s = ±1, fc = 1,2 




u(xi) = min 
a£A 


. uixi)=g{xi) 


k=l s=±lp=l 


k=l s=±l 


/ Xi £ 17 n H 


X, S G n dQ 


where denotes the fc-th element of the canonical base of 

As discussed in Section for each node Xj and control a G A, we want to 
reconstruct the value of u at the four points x“’^ (fc = 1,2, s = ±1), via bilinear 
interpolation of the first neighboring grid nodes. To this end, it suffices to choose a 
time step h such that 


(4.2) 


/i/max + < Ax , 


where /max denotes the maximum of the dynamics / on H x A. Note that condition 
(4.2) can be localized for each node Xi and control a G A. This results in a more 
accurate approximation, but it is clearly more expensive from the computational 
point of view. 

Now consider the half space defined by tt = {x S : f{xi, a) • (x — Xi) > 0} and 
also the ball B of radius \/2eh centered at x^ -I- hf{xi,a), enclosing the four points 
that need to be reconstructed (see Figure]^. The key argument of our construction 
is that, if the dijfusion ball B is contained in the half space tt, then the value u{xi) is 
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(a) 



Fig. 3. The upwind dijfusion ball condition: (a) B C tt, the interpolating (green) grid nodes 
are upwind w.r.t. the vector field f; (b) B % , some downwind (red) grid nodes enter in the 

computation. 


computed using only grid nodes that are upwind with respect to the vector field /, as 
shown in Figure[^(a). If we combine this property with a suitable order for processing 
the grid nodes (aka causality, see next subsection) we can accelerate the convergence 
of the scheme in the same spirit of fast-marching methods, reducing significantly the 
number of iterations. On the contrary, if part of B crosses tt, then also downwind 
nodes are employed in the reconstruction of u{xi), as shown in Figure [^(b). It 
follows that some information is flowing in directions opposite to the vector field /, so 
that u{xi) cannot be computed in a single pass fashion and the number of iterations 
to reach convergence increases. This is clearly expected, since we are considering 


the particular example (4.1) in which the diffusion uniformly spreads information in 
all the directions. But the crucial point here is that the semi-Lagrangian scheme 
propagates diffusion at speed and not instantaneously as in the continuous case. 
Indeed, at each iteration, we move with discrete time steps from the point Xi to the 
points x°(’^ (fc = 1,2, s = ±1), adding the contribution of the vector field / and the 
diffusion. Then, the upwind condition on the diffusion ball holds if the time step h 
satisfies also 


(4.3) 


/l/inin - > 0 , 


where /min denotes the minimum of the dynamics / on fl x Note that also (4.3) 
can be localized for each node Xi and control a 
always consider only global conditions. 


G A, but in the following we will 

We now relate the discretization parameters to the data, aiming to satisfy both 

mm with a > 0 to be 
in. Substituting in 


conditions (4.2) and (4.3). To this end, we set h = aAx/f, 
determined. Moreover, we define w = /min/2e and T = /max//i 


(4.2), we get 


laAx 

\ - < ^ 

V (jJ 


aT)Ax , 


which implies, for a < 1/T and by straightforward computations, the following con- 
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dition: 


(4.4) 


1 \/l + 4ujAxT — 1 

T ~ 2a;Aa;T2 


with a satisfying 0 < a < 1/T. On the other hand, substituting in (4.3), we immedi¬ 
ately get 


(4.5) 


a > 


1 

ujAx 


a. 


By imposing the compatibility of the two conditions, i.e. a < a, we easily obtain the 
following upwind dijfusion ball condition: 


(4.6) 


1 ^ Ax 
w ^ 1 + T■ 


Sum mariz ing, if condition (4.6) is satisfied, we can choose a G (a, a) such that both 
(4.2) and (4.3) hold. In particular, it is easy to check that a = is the only value 
satisfying a < a < a for all oj and Ax such that (4.6) holds (see also Figure]^. 
Conversely, if (4.6) fails, only one of the two conditions can be fulfilled, by choosing a 



Fig. 4. The value a = (in blue) and the graphs of a, a (in red and green respectively) as 
functions of IfuiAx, ranging in (0, jipr) prescribed by condition i4.6i. 


appropriately. We stress again that, for accuracy and computational issues discussed 
in Section Jsl from now on we will always satisfy condition (4.2), possibly loosing 
condition (4.3). 


In the general case (3.1), where the diffusion coefficient is given by t he matrix¬ 
valued map a : id X A —t £(IR'^;IR"), we obtain the same condition (4.6) with to = 
/min/lkll^, where 


max max 
(ai,a)Gnx A k—l,..,d 


\ 


/ > ^km 
m—1 




We remark that ||cr||oo identifies the diffusion vector with greatest length among the 
rows of tr, namely an upper bound for the radius of the diffusion ball. 


For a fixed and small mesh size Ax, condition (4.6) can be clearly fulfilled if 
the controlled advection dominates diffusion, i.e. for uj » 1. Note that w is a 
characteristic quantity of the problem. It resembles the Reynolds number for Navier- 
Stokes equations and the case w >> 1 is of great interest in the applications [H]. In 

























14 


this perspective, condition (4.61 describes an interesting threshold effect, by means of 
the mesh dependent parameter (1 + T). Indeed, for l/w < tax, the upwind 

diffusion ball condition is satisfied, and the semi-Lagrangian scheme exhibits the same 
peculiarities of the hyperbolic case. Conversely, for l/w > tax, the upwind diffusion 
ball condition fails and an elliptic behavior emerges. Note that this effect is also 
related to the degree of anisotropy T > 1. Equivalently, we can fix uj, depending on 
the data /, tr, and choose the mesh size Ax according to the threshold = (H-T)/a;. 
It follows that, at a coarse scale, i.e. for Aa; > Ti^, the approximation “looks like” that 
of a controlled advection equation. Again, this effect also depends on the degree of 
anisotropy, in the sense that the larger is T the coarser should be the mesh. On the 
other hand, at a sufficiently fine scale Ax < diffusion starts to get noticed and this 
hyperbolic behavior disappears as Ax —)■ 0. This is expected, due to the consistency 
of the semi-Lagrangian scheme with the considered semi-linear equation. 

Finally, we remark that, from a numerical point of view, the effect of the upwind 
diffusion ball condition (4.6) on the performance of the scheme can be less relevant 
than expected. Indeed, even if the diffusion ball B is contained in the half space tt 
associated to /, some grid points involved in the interpolation can be outside, as shown 
in Figure This strongly depends on the alignment to the grid of the optimal vector 



Fig. 5. The upwind dijfusion ball condition is satisfied, but some downwind (orange) grid nodes 
enter in the computation with small interpolation weights. 


field achieving the minimum in (3.7). Nevertheless, as confirmed by the numerical 


experiments in Section this drawback does not significantly affect the hyperbolic 
behavior of the scheme, since the downwind grid points contribute in these cases with 
very small interpolation weights, compared to the upwind ones. 


4.2. The role of causality. Under the regime given by condition (4.6), we have 


a chance to get a good performance of the patchy method. To this end, the second 
crucial point for letting work all the machinery in this more general setting is the 
causality property of hyperbolic equations outlined in the previous section, a quite 
well established topic in the literature. 

Starting from the work by Tsitsiklis m and then Sethian na, a lot of research has 
been devoted to find an implicit order of the nodes of a grid that allows to compute 
the solution in just one or very few iterations, the so called single-pass property. 
With this idea in mind, the celebrated fast-marching method has been proposed to 
solve the eikonal equation. It can be proved that, in this special case, the right 
order corresponds to process the nodes by ascending values, progressively accepting 
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as correct (and then removing from the computation) the node with the minimal value. 
This translates in computing the solution following its level sets, namely propagating 
information along its gradient curves. Since for the eikonal equation gradient curves 
coincide with characteristics, we get the correct solution. 

Unfortunately, this ordering is optimal only for eikonal equations of isotropic type, 
namely for a dynamics / that does not depend (or depends in a weak sense) on the 
control variable. We refer the reader to m for a detailed explanation of this problem. 
Whenever a strong anisotropy comes into play, the choice of a correct order for the 
grid nodes becomes a subtle topic, meaning that it may not even exist, despite the 
causality property always holds at the continuous level. This goes beyond the scope 
of this paper, but we refer the reader to [4] for a discussion on the applicability of 
fast-marching-like methods to general Hamilton-Jacobi equations. We only point out 
that, in order to compute the correct solution, one has to enlarge the stencil of the 
scheme and/or give up the single-pass property mentioned above. This results in more 
computational efforts and more iterations for the scheme to reach convergence. 

We mention here also the fast-sweeping method [I21II9] and some of its gener¬ 
alizations [5], based on another technique that, in a weaker sense, still exploits the 
causality property. The grid nodes are alternatively swept in a pre-determined number 
of directions according to the dimension of the problem, until convergence is reached. 
This makes the method iterative by construction, but it allows to compute the so¬ 
lution to very general equations of Hamilton-Jacobi type. The number of iterations 
to reach convergence is strongly dependent on the problem and the mesh structure. 
Nevertheless, it can be proved that only 2n sweeps are needed to compute the solution 
to the eikonal equation in dimension n. 

In Section we denoted by u the interpolation on the fine grid G of the coarse 
solution Uc on the coarse grid Gc- Our strategy is then to sort the grid nodes of 
G in a fast-marching fashion, according to the increasing values of u. This can be 
accomplished with some additional but cheap time consumption, using some state- 
of-the-art sorting algorithm, embedded in the pre-computation step. Note that this 
procedure was already mentioned in [5] as a possible add-on for the original patchy 
method. Here it becomes an essential part of the proposed method. We will see in 
the next section that this ordering of the grid nodes, even if in general sub-optimal, 
can give an exceptional speedup to the computation. 


4.3. The patchy algorithm. Here we summarize all the implementation steps 
of the patchy method for second order semi-linear equations. To this end, we present 
both the discrete version of the procedure (2.3) for the synthesis of the feedback 
optimal control and the atcual construction of the patchy decomposition discussed in 
Section [21 

We consider the non modified semi-Lagrangian scheme (3.3) in the special case 
without diffusion, i.e. ct = 0: 


u{xi) = min {u[xi 
a^A 


hf{xi,a)) + hl{xi,a)} Xi S G n H 

I u{xi) = g{xi) XiSGnSH 

Once the value function u is computed, we easily get 

14 71 a*(xi) = argmhi {u{xi + hf{xi, a)) + hl(xi, a)} Xi S G n H . 

' ■ ' aeA 


Note that the computational cost of this operation mainly depends on the number of 
points Na used to discretize the control set H, but also on the bilinear interpolation 
for the reconstruction of u at the points Xi -\- hf(xi,a). 
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We proceed with the construction of the dynamic decomposition. We divide the 
boundary nodes of G in a fixed number Np of sub-sets Fp, with p = 1, Np, defining 
the initial guess 




1 € Fp 

0 otherwise 


Then, for each p and a fixed tolerance ep, we iterate until convergence the scheme 


(4.8) 


= 4'p{xi + hf{xi,a*{xi)) Xi G Gnfl, 


which is a semi-Lagrangian discretization of the advection equation (2.4). Note that, 
due to the interpolation and differently from the continuous case, scheme (4.8) spreads 
the sharp values {0,1} of the initial guess, producing a final solution (ftp valued on 
the whole interval [0,1]. This makes the patches fuzzy sub-sets of ft. Hence, an 
additional thresholding procedure is needed to define them correctly. Indeed, for a 
fixed Tp G (0,1), we set 


ftp = {xi G G Dfl : 4>p(xi) > Tp} 


In practice, we loose some information, projecting <j)p on the space of characteristic 
functions. Nevertheless, this allows to choose a poor tolerance ep, accelerating the 
convergence of the scheme. Moreover, by tuning the threshold parameter rp, we can 
achieve a desired level of overlapping between the patches, including sharp interfaces. 
We recall that, by condition (4.2), the stencil of the semi-Lagrangian scheme (3.7) 


consists of hrst neighboring nodes. In addition, in this more general setting, we can 
no longer impose state constraint conditions at the boundaries of the patches. Then, 
some overlap is required, if we work with distributed memory architectures. On the 
other hand, sharp interfaces are still possible, if we prefer a shared memory architec¬ 
ture, as the one employed for the numerical experiments presented in the next section. 


Finally, the new patchy algorithm summarizes as follows: 


Initialization: 

• Build two grids G, 


• Fix tolerances e, e^, 

• Build the initial guess u[! on G, 


and G such that Gc << G. 

c, ep, the number of patches Np and the threshold Tp. 


iterating the scheme (3.7) with cr = 0 


equal to the exit cost g on G n and -l-oo 

(i.e. a very big value) otherwise. 

Pre-computation: 

• Starting from m[!, compute Uc on Gc 
until convergence (up to Ec). 

• Build u on G by interpolation of Uc. 

• Compute d* on G using u in synthesis procedure (4.7). 

• Divide the boundary nodes of G in Np sub-sets. 

• For p = 1,..., Np, use d* to compute on G the patch ftp (with threshold Tp), 


iterating the scheme (4.8) until convergence (up to ep). 

• For p = 1,..., Np, sort the nodes of ftp according to the increasing values of 
u (to exploit causality). 

Computation: 

• For p = l,...,Np, starting from u, compute Up on ftp, iterating the scheme 
(3.7) until convergence (up to e). At each iteration, update Up at the nodes 
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where flp intersects other patches, using the transmission condition 

Up(xi) = min{Mp(a;i), Ug{xi)} for all q p such that 0 7 ^ flp n 12 ^ 9 

• Build the solution u on G, merging all the Np solutions Up on flp. 

Note that all the steps of the method can be parallelized. In particular, the solution 
Uc on the coarse grid Gc can be computed by means of a standard domain decom¬ 
position technique. In the following section we will compare this dynamic domain 
decomposition to the classical static domain decomposition. The interested reader 
can find in m a good introduction to this topic and in mi [5] some static domain 
decomposition methods for first order Hamilton-Jacobi equations. 

We conclude this section by remarking that, in the case of second order semi-linear 
equations, the issue of balancing the size of the patches can no longer be addressed via 
the multi-level approach discussed in Section for first order equations. Indeed, due 
to the presence of the diffusion, we are not guaranteed that the values of the solution 
at the boundary of the current level decomposition are correct. Some information 
could flow back in the future from patches that have not yet been built. In principle, 
we can continue the construction of the decomposition, postponing the computation 
of the solution. To this end we need more processors, exactly NpNp, where Np is 
the number of levels, which is clearly bounded but a priori unknown. In the case of 
only Np available processors, an alternative could be to find an iterative method to 
solve the following optimization problem: build an initial subdivision of the boundary 
such that the corresponding patches have about the same sizes. An interesting 
question is to understand if the additional computational cost of this optimization 
procedure is compensated or not by the balancing in size of the patches. This method 
is at present under development. 


5. Numerical experiments. In this section, we present some numerical ex¬ 
periments in dimension two, performed on a server Supermicro 8045C-3RB using 1 
CPU Intel Xeon Quad-Core E7330 2.4 GHz with 32 GB RAM, running under the 
Linux Gentoo operating system. The aim is to emphasize the features of the pro¬ 
posed method. In particular, we first show that the modified semi-Lagrangian scheme 
(3.7) allows for a substantial reduction of the number of iterations needed to reach 
convergence, compared to the original scheme (3.3). Next we show that it is able to 
compute the solution to general nonlinear equations with very degenerate diffusion. 
Finally, we combine the scheme with the upwind diffusion ball condition (4.6) and the 
fast-marching-like sorting of the grid nodes, in order to get an extra reduction of the 
computational time. This will confirm the effectiveness of the extension of the patchy 
method to the more general setting of second order Hamilton-Jacobi equations. 

In all the following tests, we set the domain H = [—1,1]^ and the control set 
A = Bi (the unit ball centered in the origin), which is discretized by means of 16 
points. Moreover, if not differently specified, we take the boundary datum g{x) = 0, 
the running cost l{x,a) = 1 and the diffusion a{x,a) = •\/2e/2, where £ > 0 and 
I 2 denotes the 2x2 identity matrix. The corresponding second order operator in 
equation (3.1) is just —eAu, i.e. the laplacian with diffusion coefficient e. Finally, we 
denote by xs the characteristic function of a generic subset S' of H. 

Now, let us give a list of typical problems, depending on the choice of the dynamics 
fix, a): 

A) fix, a) = b{x) for a given vector field 6 : H —>■ The corresponding equation 

is a stationary advection equation along b with uniform source: 


bix) ■ Vw(a;) = 1. 
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B) /(x, a) = c(x)a for a given positive speed function c : fi U. In this case 
it is easy to see that the max in (2.11 is achieved for a = —VM(a;)/|Vu(a;)|. 
The corresponding equation is the eikonal equation, whose solution u{x) rep¬ 
resents the minimum time to reach the boundary dfl starting from x G and 
traveling at speed c: 


c(a;)|VM(a;)| = 1. 

For c = 1 we recover the distance function from the boundary dCl. 

C) f{x, a) = Yq:jy |2 + 2 “) > where rj e {0,1} is a fixed switch parameter to 

activate/deactivate the control and Re is a hxed counter-clock-wise rotation 
with 0 < ^. The corresponding equation is a controlled advection equation 
with uniform source, also termed the Zermelo navigation problem. This is 
an example hard to compute, due to the strong anisotropy introduced by the 
control, namely the dependency of the speed \f{x, a)| on the direction a. See 
below for further details. 

Whenever e ^ 0 we use to add the term diffusion to declare the presence of the 
laplacian, as for the following eikonal-diffusion equation 


f —£Ait(a;) -I- c(a;)|VM(a:)| = 1 x G id 
I u{x) = 0 x G dVt 


5.1. Self-dependency removal. Here we compare our modihed semi-Lagrangian 
scheme ( |3.7[ ) and the original one (3.31 without any removal of the self-dependency. 
In Table [Ij we report the number of iterations needed to converge for both schemes in 
the cases listed above. All the solutions are computed on a 50 x 50 grid. 


Table 1 

The modified SL scheme VS the original one plsl i. Number of iterations to reach conver¬ 

gence for different test dynamics and diffusion coefficients. 


Equation 

Dynamics 

£ 

SL (|3.3|l 

SL l|3.7|l 

A 

b(x) = (1,0) 

0 

176 

2 

A 

b{x) = (1,0) 

le-2 

458 

102 

B 

c(x) = 1 

0 

145 

26 

B 

clx) = 1 

le-2 

353 

98 

B 

c(x} = l-l-X|^i>o>(a;) 

0 

276 

33 

B 

c{x) = l + 

le-2 

305 

78 

C 

r} = l 

0 

716 

116 

C 

r] = l 

le-2 

961 

325 


For a pure advection dynamics (A) with vector field b = (1,0) and e = 0, the causality 
property is explicit, in the sense that, starting from the left boundary, information 
propagates to the right in the whole domain. Since, by default, we process the grid 
nodes from left to right and from top to bottom, then the modified SL scheme con¬ 
verges in this case in just one iteration. Each node is computed (backward in time) 
using only nodes on its left, hence it is correct by induction (see Figure [^). Note 
that the additional iteration reported in the table is the one needed to check the con¬ 
vergence. On the contrary, the original scheme employs the value of a node (which 
is set at the beginning to a big value as initial guess) to compute itself. It follows 
that the correct value carried by the left neighbor is changed by the interpolation and 
several additional fixed-point iterations are needed to fix it. The case of the eikonal 


















Fig. 6. Optimal vector fields for the stationary advection equation (a) and the eikonal equation 


(b). 


equation (B) with uniform speed c = 1 and e = 0 is similar, as shown in Figure]^. It 
turns out that characteristics are straight lines, moving from the boundary until they 
intercept the diagonals of the square. Using as before the default order for processing 
the grid nodes (left to right, top to bottom), it is easy to see that, at the end of the 
first iteration, only the nodes above the diagonal xi = X 2 will contain correct values of 
the solution. For the remaining part of the domain, only the first neighboring nodes 
of the boundary are correctly updated. The same holds in the following iterations, 
as long as information reaches the middle of the 50 x 50 grid. This gives precisely 
25 iterations (plus one to check the convergence) for the modified SL scheme and 
much more (145) for the original scheme. This behavior is confirmed also for more 
complicated dynamics, as for the eikonal equation (B) with a non homogeneous speed 
function and for the anisotropic dynamics (C). We finally remark that, in presence of 
diffusion e > 0, the causality property of hyperbolic equations is lost and the number 
of iterations necessarily increases. Nevertheless, we still get a relevant reduction of 
this number for the modified SL scheme. From now on, only this scheme will be 
employed. 

5.2. Degenerate diffusion. The following tests aim at showing the built-in 
ability of the proposed semi-Lagrangian scheme to solve problems with degenerate 
diffusion. To this end, we allow the diffusion coefficient e to depend also on cc, taking 
the form e{x) = 0 .1x{ a;2>o>(x). With this choice, we consider again the eikonal- 
diffusion equation ( |5.l[ ) with speed c{x) = 1 -|- X{xi>o}{x) and we note that it is 
elliptic in n n {x 2 > 0} and hyperbolic otherwise. Accordingly, at the grid nodes 
belonging to fl n {x 2 < 0}, we recover by construction the semi-Lagrangian scheme 
designed for first order hyperbolic equations. Figure shows the level sets of the 
corresponding solution. We can immediately distinguish the two different behaviors, 
by looking at the smoothness of the solution. In particular, the sharp corners corre¬ 
sponding to the shocks in the hyperbolic part of the domain, are completely smoothed 
out in n n {x 2 > 0}. 

Now we consider the more suggestive and complicated example of the Zermelo 
navigation problem, whose dynamics is rewritten here for the reader’s convenience: 

This problem consists in reaching the boundary dO, in minimum time, starting from a 
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Fig. 7. Level sets of the solution to a degenerate eikonal-diffusion equation with a non homo¬ 
geneous speed: the diffusion is present only in the upper half of the square, affecting the smoothness 
of the solution. 


point in and using the control a. The first term in the dynamics represents a whirling 
drift from the origin to the boundary, whereas the fixed parameter rj € { 0 , 1 } allows 
to switch on/off the control. This introduces in the problem both inhomogeneity and 
strong anisotropy. It is easy to see that, for a fixed 6 < ^, the counter-clock-wise 
rotation Rg guarantees the reachability of the boundary for each starting point in 12, 
even for t] = 0, but we want to play with the control a to minimize the time of arrival. 
In Figure we compare the level sets of the solutions and the corresponding optimal 
dynamics for three different cases. The first case (see Figure [^) represents a pure 
advection-diffusion equation, with uniform diffusion e = 0.1 on the whole 12 and no 
control, i.e. rj = 0. In the second case (see Figure [^) also the control is active, i.e. 
77 = I, and we can clearly see how it produces a resistance to the drift that allows to 
reach the boundary in a smaller time. This is much more evident in the third case 
(see Figure]^) where the control is still active and the diffusion is switched off, i.e. 
e = 0. 

5.3. General nonlinear problems. This test shows the ability of the proposed 
semi-Lagrangian scheme to solve nonlinear problems with very general diffusion terms 
a and running costs I, possibly depending on cc € 12 and a = ( 01 , 02 ) S Bi. To this 
end, we consider a non homogeneous dynamics of the form 

(5.2) f(x,a)=c{x)a with c(a;) = 1 -I- max{a: 2 , maxjcci, 0}} . 

Moreover, we define de{x) = and we consider three different diffusion 

terms: 

(5.3) cri=0, a 2 ix) = ds{x)l 2 , a 3 {x,a) = de{x)a. 

Note that 172 corresponds to the usual uniform two dimensional Wiener process, 
whereas 03 defines a one dimensional Wiener process in the direction of the control 
a. Finally, we consider three different running costs: 

(5.4) ^1 = 1, ^ 2 ( 2 :) = 1 + |a;ia;2|, ^{x^a) = l + \xiX2\+ „ ■ 

Z -\- CI 2 

In Figure we show the level sets of the solutions corresponding to all the nine 
possible pairs {h,crj) for i,j = 1,2,3. We clearly see the effect of the diffusion, 
whenever applied in the upper half of the square. Moreover, we can also distinguish 
the different behavior between the 2D uniform diffusion and the ID control driven 
diffusion. 
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(a) 


(b) 


(c) 


Fig. 8. The Zermelo navigation problem, level sets of the solution and optimal dynamics: (a) 
diffusion without control (e = 0.1, rj = 0), (b) diffusion with control (e = 0.1, rj = 1), (c) control 
without diffusion (e = 0, rj = 1). 


5.4. Performance comparison. Here we present a performance comparison 
between a standard Domain Decomposition method (DD) and the Patchy Domain 
Decomposition method (PDD). Here the aim is to convince the reader that, in order 
to obtain an additional speedup in the computation, the fundamental ingredients are 
the causality property and the upwind diffusion ball condition discussed in Section]^ 
To this end, we consider the eikonal-diffusion equation with speed c{x) = 1 and the 
Zermelo navigation problem above, both in the case with uniform diffusion e on the 
whole H. We build the corresponding patchy domain decompositions starting from 
a subdivision of the boundary dfl in four parts, namely the sides of the square. A 
natural choice to perform this computation in parallel is to employ four processors. 
Figure [T0| shows, for a 100 x 100 grid, the resulting dynamic decompositions compared 
to an arbitrary and static one. Note that, in this case, the sub-domains of the three 
decompositions (the four triangles, the four spirals and the four squares respectively) 
have exactly the same size. 

We choose a starting coarse grid of 50 x 50 nodes and different fine grids, up to 
800 X 800. For each test, we sort the grid nodes of the patchy decompositions according 
to the increasing values of the pre-computed coarse solutions interpolated on the 
corresponding fine grid. As already remarked, for the eikonal equation {e = 0) this 
corresponds to the optimal order to exploit causality in a fast-marching fashion. 

In the following tables we report, for each fine grid and for different values of 
the diffusion coefficient e, the CPU time in seconds and the number of iterations to 
reach convergence for the PDD method and a standard DD method. We remark that 
PDD’s times also include the pre-computation step. 

Table|^refers to the eikonal-diffusion equation. The effect of the upwind diffusion 
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Fig. 9. Level sets of the solutions corresponding to the dynamics l |5^ and to the pairs {U, aj) 
(i^j = 1,2,3J of running costs (|5.4|| and diffusion terms (|5.3[). 



(a) (b) (c) 

Fig. 10. Domain decompositions in 4 sub-domains: (a) FDD for the eikonal dynamics, (b) 
FDD for the Zermelo dynamics, (c) a standard DD for both dynamics. 
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Table 2 

The eikonal-diffusion equation: FDD vs DD in terms of CPU time in seconds (and number 
of iterations) to reach convergence. Values in the first row of each cell concern FDD, the others 
concern DD. Values in bold are obtained in the regime given by the upwind diffusion ball condition. 



100^ 

200^ 

400^ 

800^ 

N 


0.02 

0.01 

0.005 

0.0025 

Ax 


10-9 

0.18 (6) 
0.57 (52) 

0.62 (6) 
4.56 (102) 

3.39 (6) 

35.88 (202) 

21.53 (6) 

297.98 (402) 



10“® 

0.22 (9) 
0.64 (54) 

0.79 (10) 
4.61 (104) 

4.52 (12) 

36.52 (205) 

27.48 (14) 

302.47 (406) 



10-^ 

0.23 (10) 
0.68 (55) 

0.89 (12) 
4.72 (105) 

4.92 (14) 

37.53 (206) 

29.49 (17) 

304.59 (408) 



10-® 

0.26 (12) 
0.70 (56) 

0.98 (14) 
4.77 (107) 

5.63 (18) 

39.15 (209) 

34.76 (23) 

312.19 (413) 


£ 

10“® 

0.28 (14) 
0.71 (57) 

1.19 (18) 
4.84 (109) 

6.94 (25) 

41.95 (214) 

42.82 (35) 

326.93 (422) 



10-"* 

0.34 (18) 
0.79 (60) 

1.47 (25) 
5.08 (114) 

8.87 (37) 

44.18 (223) 

63.13 (62) 

349.71 (445) 



6.25 X 10-4 

0.43 (24) 
0.82 (64) 

2.08 (37) 
5.89 (123) 

13.65 (61) 
46.31 (243) 

244.54 (303) 
523.47 (700) 



1.25 X 10-® 

0.45 (27) 
0.84 (67) 

2.34 (43) 
6.17 (128) 

38.09 (194) 
72.27 (393) 

391.86 (508) 
670.94 (907) 



2.5 X 10-® 

0.51 (32) 
0.85 (69) 

6.54 (132) 
11.04 (232) 

63.74 (326) 
101.05 (525) 

710.77 (816) 
1160.61 (1216) 



5 X 10-® 

1.26 (96) 
1.66 (145) 

11.15 (223) 
16.31 (322) 

122.05 (528) 
188.43 (726) 

1863.65 (2180) 
2424.73 (2571) 



10-2 

2.20 (163) 
2.44 (212) 

18.83 (366) 
22.98 (463) 

308.05 (1459) 
386.64 (1649) 

5091.73 (5724) 
5979.71 (6081) 



ball condition (4.6) is completely clear and somehow impressive. Indeed, in this test 
/min = /max = 1 and T = 1, then it reads 


7 max — 1 and X — 1 
2e 


f min 


1 

= - < 
U! 


Ax 
1 + T 


e < 


Ax 

~T 


For the considered fine grids with Ax = 0.02, 0.01, 0.005, 0.0025 this gives the thresh¬ 
olds tas = 5 X 10“^, 2.5 X 10“^, 1.25 x 10“^, 6.25 x 10“'* respectively. We see that 
both the CPU time and the number of iterations jump when the diffusion coefficient 
reaches the corresponding threshold. In Figure [T^ we also show the number of it¬ 
erations as a function of e for the case Aa; = 0.02 with a large number of samples. 
This dramatic change of behavior is due to the fact that the dynamics is isotropic 
(its module does not depend on the control) and constant in speed (c(a:) = 1). It 
turns out that, in this special case, the upwind diffusion ball condition is not only a 
global but also a local condition, the same at each point of the domain. Then, for 
e = 5 X 10“^ this condition fails simultaneously everywhere, producing the observed 
jump. Note that also the standard DD method exhibits the same behavior across the 
threshold, but the performance of the PDD method is much better, and this depends, 
as expected, on the reordering of the nodes in the individual patches. 

In Table we report the results for the Zermelo navigation problem. It is easy 
to see that /mm = 1/6 and /max = 3/2, so that T = 9 and the upwind diffusion ball 
condition reads 


e < 


Ax 

■ 


For Ax = 0.02,0.01,0.005,0.0025 we get the thresholds tax = 1-6 x 10 ^,8.3 x 
10“^4.16 X 10 2.083 X 10 ® respectively. We still observe a raising in the number of 



























24 



(a) (b) 


Fig. 11. FDD vs DD in terms of number of iterations (as functions of e in logarithmic 
scale) for Ax = 0.02. Vertical dashed lines identify the thresholds given by the upwind diffusion ball 
condition: (a) the diffusion-eikonal equation, (b) the Zermelo navigation problem. 


iterations above the corresponding threshold, but in this case the effect is moderate, no 
jump is present (see Figure [T^). This depends on the fact that the upwind diffusion 
ball condition is here only a global condition, and the threshold estimates the worst 
case scenario of the first points where it fails. Moreover, this threshold is not sharp 
as before. Indeed, the value /min = 1/6 is attained for instance at {xi,X 2 ) = (1,1) 
for a = whereas the value /max = 3/2 is attained for a = Rgj^ in the limit 

|1C| 

X ^ 0. Locally the optimal dynamics can do better (and it does!), so that, over the 
threshold, we can still have a good performance of the FDD method, compared to 
the standard DD method. 

It is important to remark that this is a case where the fast-marching method fails, 
due to the strong anisotropy of the problem. It follows that the reordering of the nodes 
according to the increasing values of the coarse solution is only sub-optimal. Despite 
we still have an improvement with respect to the standard DD method which does 
not exploit any causality, this explains the different performance of the FDD method 
compared to the previous test. In particular we see that the number of iterations is 
much larger than that of the eikonal-diffusion equation and it remains unchanged for 
several orders of magnitude of e (see again Figure 11). This means that, even for 
e = 0, the dependency between the nodes is already involved and very far from the 
true causality expected in the continuous case for Aa; —> 0. 

Finally, we observe in both experiments that, as the diffusion coefficient increases, 
the FDD method becomes comparable with the DD method. Indeed, the diffusion 
starts mixing information in the whole domain, so that the causality is completely 
lost and the reordering turns out to be not only sub-optimal but also useless. In this 
situation the pre-computation step just becomes a time wasting procedure. 


6. Conclusions. We proposed a parallel algorithm for the numerical solution of 
a class of second order semi-linear equations coming from stochastic optimal control 
problems. The new method is based on a dynamic domain decomposition technique, 
and it is a non trivial extension of the patchy domain decomposition method, presented 
by the authors in [5] for first order Hamilton-Jacobi-Bellman equations. We presented 
a modified semi-Lagrangian local solver, in which we removed the self-dependency 
of the grid nodes induced by the interpolation. This feature, combined with fast- 
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Table 3 

The Zermelo navigation problem: FDD vs DD in terms of CPU time in seconds (and number 
of iterations) to reach convergence. Values in the first row of each cell concern FDD, the others 
concern DD. Values in bold are obtained in the regime given by the upwind diffusion ball condition. 



100^ 

200^ 

400^ 

800^ 

N 


0.02 

0.01 

0.005 

0.0025 

Ao; 


10-® 

2.94 (132) 
3.24 (191) 

16.20 (208) 
22.52 (325) 

96.68 (315) 
157.74 (577) 

642.71 (522) 
1146.91 (1045) 



10“® 

2.97 (132) 
3.25 (191) 

16.21 (208) 
22.53 (325) 

96.70 (315) 
157.75 (577) 

643.01 (522) 
1147.27 (1045) 



10-" 

2.98 (132) 
3.25 (191) 

16.25 (208) 
22.54 (325) 

97.14 (315) 
158.15 (577) 

645.27 (522) 
1148.35 (1046) 



10-6 

2.98 (132) 
3.26 (191) 

16.33 (209) 
22.57 (325) 

97.23 (316) 
158.36 (577) 

648.84 (523) 
1207.03 (1048) 


e 

10-6 

2.99 (132) 
3.26 (191) 

16.30 (209) 
22.60 (326) 

98.91 (318) 
159.97 (581) 

678.33 (533) 
1239.07 (1061) 



2.083 X 10-5 

3.01 (132) 
3.27 (191) 

16.41 (210) 
22.67 (327) 

98.95 (320) 
161.16 (583) 

694.57 (543) 

1264.94 (1075) 



4.16 X 10-5 

3.02 (133) 
3.30 (191) 

16.76 (212) 
23.28 (329) 

101.51 (326) 
172.55 (591) 

713.82 (571) 

1337.13 (1099) 



8.3 X 10-5 

3.05 (134) 
3.40 (192) 

16.95 (216) 
23.33 (333) 

106.20 (344) 
167.98 (604) 

814.97 (618) 

1405.86 (1149) 



1.6 X 10-* 

3.12 (138) 
3.35 (194) 

17.95 (227) 
25.45 (341) 

120.27 (389) 
184.91 (636) 

948.74 (749) 

1494.67 (1287) 



10-5 

4.08 (187) 
4.18 (232) 

27.24 (358) 
33.07 (467) 

242.76 (774) 
312.96 (996) 

3470.90 (2827) 
4214.57 (3261) 



5 X 10-5 

8.38 (418) 
8.76 (444) 

111.50 (1461) 
113.27 (1507) 

981.54 (3221) 
1093.78 (3360) 

14634.04 (12105) 
15286.98 (12348) 



marching-like techniques, allowed to accelerate the convergence of the method. More¬ 
over, we introduced the upwind diffusion ball condition, a geometric property that 
involves the discretization parameters and the data of the problem. It guarantees, 
despite the presence of a diffusion process, a qualitative behavior of hyperbolic type 
for the discretized equation. This translated into an additional acceleration for the 
proposed method. Several numerical experiments conhrmed the ability of the semi- 
Lagrangian scheme to deal with very degenerate diffusion terms and quite general 
nonlinar equations. Finally, we perfomed a comparison with a standard domain de¬ 
composition method. We showed that, under the regime of the upwind diffusion ball 
condition and for a sufficiently fine grid, the speedup of the new patchy domain decom¬ 
position method is remarkable, despite the pre-computation step and the transmission 
conditions. This is the main achievement of this work. 
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